Exploiting Secondary Sources for Automatic Object Consolidation

Authors

  • Martin Michalowski
  • Snehal Thakkar
  • Craig A. Knoblock
Abstract

Information sources on the web are controlled by different organizations or people, use different text formats, and contain varying inconsistencies. Any system that integrates information from such sources must therefore consolidate their data. Many web data sources, however, do not contain enough information for accurate consolidation, even with state-of-the-art object consolidation systems. We present an approach that accurately and automatically consolidates data from various sources by combining a state-of-the-art object consolidation system with a mediator system. When the object consolidation system cannot confidently determine whether two records refer to the same entity, the mediator system automatically determines which secondary sources to query. The object consolidation system then uses this additional information to improve the accuracy of consolidation between datasets.
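The control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the token-overlap similarity, the two thresholds, and the names `similarity`, `classify`, `consolidate`, and `secondary_sources` are all assumptions made for illustration. The only idea taken from the abstract is that secondary sources are consulted when the first comparison is inconclusive, and that the enriched records are then compared again.

```python
# Hypothetical thresholds; the abstract does not give any values.
MATCH, NON_MATCH = 0.75, 0.5


def jaccard(x, y):
    """Token-set overlap between two strings (a stand-in similarity measure)."""
    s, t = set(str(x).lower().split()), set(str(y).lower().split())
    return len(s & t) / len(s | t) if s | t else 0.0


def similarity(a, b, fields):
    """Average field similarity over the fields present in both records."""
    shared = [f for f in fields if f in a and f in b]
    if not shared:
        return 0.0
    return sum(jaccard(a[f], b[f]) for f in shared) / len(shared)


def classify(score):
    """Map a score to match / non-match / uncertain using the thresholds."""
    if score >= MATCH:
        return "match"
    if score < NON_MATCH:
        return "non-match"
    return "uncertain"


def consolidate(a, b, fields, secondary_sources):
    """Compare two records, querying secondary sources only when unsure.

    secondary_sources maps a new field name to a lookup function that takes
    a record and returns a value for that field (a stand-in for the mediator).
    """
    label = classify(similarity(a, b, fields))
    for extra_field, lookup in secondary_sources.items():
        if label != "uncertain":
            break  # confident decision reached; no further queries needed
        a = {**a, extra_field: lookup(a)}
        b = {**b, extra_field: lookup(b)}
        fields = [*fields, extra_field]
        label = classify(similarity(a, b, fields))
    return label


# Usage with a made-up geocoder table as the secondary source.
coords = {"12224 ventura blvd": "34.14 -118.39"}
r1 = {"name": "art's deli", "street": "12224 ventura blvd"}
r2 = {"name": "art's delicatessen", "street": "12224 ventura blvd"}
result = consolidate(r1, r2, ["name", "street"],
                     {"geo": lambda r: coords.get(r["street"], "")})
print(result)
```

In this sketch the two records are uncertain on name and street alone, and the extra field from the secondary source tips the decision. A real mediator would also choose which sources are worth querying, which the loop above does not attempt.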

Similar resources

Object-Oriented Method for Automatic Extraction of Road from High Resolution Satellite Images

As the information carried in a high spatial resolution image is not represented by single pixels but by meaningful image objects, which include the association of multiple pixels and their mutual relations, the object based method has become one of the most commonly used strategies for the processing of high resolution imagery. This processing comprises two fundamental and critical steps towar...

Performing Object Consolidation on the Semantic Web Data Graph

An important aspect of Semantic Web technologies is the issue of identity and uniquely identifying resources, which is essential for integrating data across sources. Currently, there is poor agreement on the use of common URIs for the same instances across sources and as a result a naively integrated dataset might miss associations between resources. To solve the problem, we present a method fo...

Design of a Data Warehouse over Object-Oriented and Dynamically Evolving Data Sources

In this paper we present some of the results achieved while realizing an international research project aiming at the design and development of an Object-Relational Data Warehousing System (ORDAWA). The most important goals of the project are as follows: the development of techniques for the integration and consolidation of different external data sources in an object-relational data warehouse,...

Designing an Object–Relational Data Warehousing

In this paper we present a research project aiming at the design and development of an Object–Relational Data Warehousing System (ORDAWA). The project is conducted in co–operation of Institute for Informatics–Systems, Klagenfurt University, and Institute of Computing Science, Poznań University of Technology. Important goals of the project are to develop techniques for the integration and consol...

MOMA - A Mapping-based Object Matching System

Object matching or object consolidation is a crucial task for data integration and data cleaning. It addresses the problem of identifying object instances in data sources referring to the same real world entity. We propose a flexible framework called MOMA for mapping based object matching. It allows the construction of match workflows combining the results of several matcher algorithms on both ...

Publication date: 2003